Density-based clustering of short-text corpora Agrupamiento de textos cortos basado en densidad

Authors

  • Diego A. Ingaramo
  • Marcelo L. Errecalde
  • Paolo Rosso
Abstract

In this work, we analyse the performance of different density-based algorithms on short-text and narrow-domain short-text corpora. We attempt to determine to what extent the features of this kind of corpus affect the density computation of the clusterings obtained, and how robust these algorithms are to the different complexity levels.

Similar resources

A Particle Swarm Optimizer to Cluster Parallel Spanish-English Short-text Corpora Un Optimizador basado en Cúmulo de Partículas para el Agrupamiento de Textos Cortos de Colecciones Paralelas en Español-Inglés

Short-text clustering is currently an important research area because of its applicability to web information retrieval, text summarization and text mining. These texts are often available in different languages and in parallel multilingual corpora. Some previous works have demonstrated the effectiveness of a discrete Particle Swarm Optimizer algorithm, named CLUDIPSO, for clustering monolingual ...

Clustering Iterativo de Textos Cortos con Representaciones basadas en Conceptos

Abstract: The current trend of working with short documents (blogs, text messages, and others) has generated growing interest in automatic processing techniques for documents with these characteristics. In this context, short-text clustering is a very important research area, which can play a fundamental role in organizing these large volumes...

Performance analysis of Particle Swarm Optimization applied to unsupervised categorization of short texts Análisis de Prestación de Particle Swarm Optimization aplicado a Categorización no Supervisada de Textos Cortos

Nowadays there is a need to access online information such as abstracts, news, opinions, product evaluations, etc. That information is generally available on the web as short texts. Previous works have demonstrated the effectiveness of a discrete Particle Swarm Optimization algorithm, named CLUDIPSO, for clustering small short-text corpora. This article presents a preliminary study abou...

Minería de opiniones centrada en tópicos usando textos cortos en español

Users freely express their feelings about entities related to a specific topic using short texts on social networks. Sentiment analysis, also known as opinion mining, focuses on examining these texts to determine their polarity. This article presents an approach to topic-based opinion mining from Twitter texts in Spanish. The main objective is to decide the polarity of a text, deter...

Un Análisis Comparativo de Estrategias para la Categorización Semántica de Textos Cortos

Nowadays, short-text categorization is an important research area because most of the information we usually receive and work with has this characteristic (e-mails, text messages, news, etc.). Different studies have reported interesting results in text categorization by adding semantic information to documents’ representation. However, these studies have not focused on the particularities tha...

Publication date: 2008